Rate my Hydrograph: Evaluating the Conformity of Expert Judgment and Quantitative Metrics

Oral presentation at the EGU General Assembly 2022 on a social study to compare expert rankings of simulated hydrographs with quantitative metrics.

Abstract

As hydrologists, we pride ourselves on being able to identify deficiencies of a hydrologic model by looking at its runoff simulations. One of the first requests a practicing hydrologist makes when presented with a new model is: “show me some hydrographs!” Everyone has an intuition about how a “real” (i.e., observed) hydrograph should behave [1, 2]. Although there is a large suite of summary metrics that measure differences between simulated and observed hydrographs, those metrics do not always fully account for our professional intuition about what constitutes an adequate hydrological prediction (perhaps because metrics typically aggregate over many aspects of model performance). To us, this suggests that either (a) there is potential to improve existing metrics to conform better with expert intuition, or (b) our expert intuition is overvalued and we should focus more on metrics, or (c) a bit of both.
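To make "summary metrics" concrete (this example is ours, not part of the abstract): the Nash–Sutcliffe efficiency (NSE) and Kling–Gupta efficiency (KGE) are two of the most widely used hydrograph metrics, and both compress the full simulated-versus-observed comparison into a single number. A minimal NumPy sketch:

import numpy as np

def nse(obs: np.ndarray, sim: np.ndarray) -> float:
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is on par with
    predicting the mean observed flow, negative is worse than that."""
    return 1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)

def kge(obs: np.ndarray, sim: np.ndarray) -> float:
    """Kling-Gupta efficiency: aggregates correlation, bias, and
    variability errors into a single score (1 is a perfect fit)."""
    r = np.corrcoef(obs, sim)[0, 1]   # linear correlation
    beta = sim.mean() / obs.mean()    # bias ratio
    alpha = sim.std() / obs.std()     # variability ratio
    return 1.0 - np.sqrt((r - 1) ** 2 + (beta - 1) ** 2 + (alpha - 1) ** 2)

Because each metric collapses timing, bias, and variability errors into one score, two simulations can score identically while failing in ways an expert would judge very differently.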

In the social study proposed here, we aim to address this issue in a data-driven fashion: we will ask experts to visit a website that shows them two unlabeled simulated hydrographs side by side, together with the corresponding observed hydrograph, and asks them to decide which of the two matches the observations better. Together with information about the experts’ background expertise, the collected responses should help paint a more nuanced picture of the aspects of hydrograph behavior that different members of the community consider important. This should provide valuable information that may enable us to derive new (and hopefully better) model performance metrics directly from human ratings.
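As an illustration of how such pairwise responses might be turned into scores (our own sketch, not a method stated in the abstract), a Bradley–Terry model estimates a latent "strength" per candidate simulation from win/loss counts. A minimal implementation using the classic MM updates:

import numpy as np

def bradley_terry_scores(wins: np.ndarray, n_iter: int = 200) -> np.ndarray:
    """Fit a Bradley-Terry model with the classic MM updates.
    wins[i, j] counts how often simulation i was preferred over
    simulation j; returns one strength score per simulation.
    Assumes every simulation appears in at least one comparison."""
    n = wins.shape[0]
    p = np.ones(n)          # initial strengths
    games = wins + wins.T   # total comparisons per pair
    for _ in range(n_iter):
        for i in range(n):
            mask = games[i] > 0  # opponents i was compared against
            denom = np.sum(games[i, mask] / (p[i] + p[mask]))
            p[i] = np.sum(wins[i]) / denom
        p /= p.sum()        # only strength ratios are identified
    return p

# Hypothetical tallies for three candidate simulations:
wins = np.array([[0, 8, 6],
                 [2, 0, 5],
                 [4, 5, 0]])
print(bradley_terry_scores(wins))  # highest score = most often preferred

Conditioning such a model on expert background or on hydrograph characteristics would be one way to learn which aspects of hydrograph behavior drive the preferences.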

[1] Crochemore, Louise, et al. “Comparing expert judgement and numerical criteria for hydrograph evaluation.” Hydrological Sciences Journal 60.3 (2015): 402–423.

[2] Wesemann, Johannes, et al. “Man vs. Machine: An interactive poll to evaluate hydrological model performance of a manual and an automatic calibration.” EGU General Assembly Conference Abstracts. 2017.

Abstract: Gauch, M., Kratzert, F., Mai, J., Tolson, B., Nearing, G., Gupta, H., Hochreiter, S., and Klotz, D.: Rate my Hydrograph: Evaluating the Conformity of Expert Judgment and Quantitative Metrics, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-8396, https://doi.org/10.5194/egusphere-egu22-8396, 2022.

Citation

@inproceedings{gauch2022egu,
  title={Rate my Hydrograph: Evaluating the Conformity of Expert Judgment and Quantitative Metrics},
  author={Gauch, Martin and Kratzert, Frederik and Mai, Juliane and Tolson, Bryan and Nearing, Grey and Gupta, Hoshin and Hochreiter, Sepp and Klotz, Daniel},
  booktitle={EGU General Assembly 2022},
  address={Vienna, Austria},
  month=may,
  year={2022},
  note={Online},
  doi={10.5194/egusphere-egu22-8396}
}